QuASAR: quantitative allele-specific analysis of reads

Authors

  • Chris T. Harvey
  • Gregory A. Moyerbrailean
  • Gordon O. Davis
  • Xiaoquan Wen
  • Francesca Luca
  • Roger Pique-Regi
Abstract

Motivation: Expression quantitative trait loci (eQTL) studies have discovered thousands of genetic variants that regulate gene expression, enabling a better understanding of the functional role of non-coding sequences. However, eQTL studies are costly, requiring large sample sizes and genome-wide genotyping of each sample. In contrast, analysis of allele-specific expression (ASE) is becoming a popular approach to detect the effect of genetic variation on gene expression, even within a single individual. This is typically achieved by counting the number of RNA-seq reads matching each allele at heterozygous sites and testing the null hypothesis of a 1:1 allelic ratio. In principle, when genotype information is not readily available, it could be inferred from the RNA-seq reads directly. However, there are currently no existing methods that jointly infer genotypes and conduct ASE inference, while considering uncertainty in the genotype calls.

Results: We present QuASAR, quantitative allele-specific analysis of reads, a novel statistical learning method for jointly detecting heterozygous genotypes and inferring ASE. The proposed ASE inference step takes into consideration the uncertainty in the genotype calls, while including parameters that model base-call errors in sequencing and allelic over-dispersion. We validated our method with experimental data for which high-quality genotypes are available. Results for an additional dataset with multiple replicates at different sequencing depths demonstrate that QuASAR is a powerful tool for ASE analysis when genotypes are not available.

Availability and implementation: http://github.com/piquelab/QuASAR.

Contact: [email protected] or [email protected]

Supplementary information: Supplementary Material is available at Bioinformatics online.
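The baseline test mentioned in the abstract (counting reads per allele at a heterozygous site and testing a 1:1 allelic ratio) can be illustrated with an exact two-sided binomial test. The following is a minimal sketch, not QuASAR's model: it assumes the site is known to be heterozygous and ignores genotype uncertainty, base-call errors and over-dispersion, all of which QuASAR is described as modelling. The function name `ase_binomial_pvalue` is hypothetical.

```python
from math import comb


def ase_binomial_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value for the null of a 1:1 allelic ratio.

    Assumption (not from the paper): the site is truly heterozygous and
    reads are independent, so ref_reads ~ Binomial(n, 0.5) under the null.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    # Under p = 0.5 every outcome k has probability comb(n, k) / 2**n,
    # so outcomes can be ranked by their integer binomial coefficient.
    observed = comb(n, ref_reads)
    # Sum the weights of all outcomes at least as unlikely as the observed one.
    extreme = sum(c for c in (comb(n, k) for k in range(n + 1)) if c <= observed)
    return extreme / 2**n


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    for ref, alt in [(5, 5), (8, 2), (10, 0)]:
        print(ref, alt, ase_binomial_pvalue(ref, alt))
```

Working in integer binomial coefficients avoids floating-point ties when deciding which outcomes are "at least as extreme". With real RNA-seq data this simple test tends to be over-confident, which is one reason the abstract stresses modelling over-dispersion.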

Related resources

Genetics and population analysis QuASAR: quantitative allele-specific analysis of reads

Motivation: Expression quantitative trait loci (eQTL) studies have discovered thousands of genetic variants that regulate gene expression, enabling a better understanding of the functional role of non-coding sequences. However, eQTL studies are costly, requiring large sample sizes and genome-wide genotyping of each sample. In contrast, analysis of allele-specific expression (ASE) is becoming a ...

QuASAR-MPRA: accurate allele-specific analysis for massively parallel reporter assays

Motivation The majority of the human genome is composed of non-coding regions containing regulatory elements such as enhancers, which are crucial for controlling gene expression. Many variants associated with complex traits are in these regions, and may disrupt gene regulatory sequences. Consequently, it is important to not only identify true enhancers but also to test if a variant within an en...

WASP: allele-specific software for robust discovery of molecular quantitative trait loci

Allele-specific sequencing reads provide a powerful signal for identifying molecular quantitative trait loci (QTLs), however they are challenging to analyze and prone to technical artefacts. Here we describe WASP, a suite of tools for unbiased allele-specific read mapping and discovery of molecular QTLs. Using simulated reads, RNA-seq reads and ChIP-seq reads, we demonstrate that our approach h...

Hierarchical Analysis of Multi-mapping RNA-Seq Reads Improves the Accuracy of Allele-specific Expression

Abstract. Allele-specific expression (ASE) refers to the differential abundance of the allelic copies of a transcript. Direct RNA sequencing (RNA-Seq) can provide quantitative estimates of ASE for genes with transcribed polymorphisms. However, estimating ASE is challenging due to ambiguities in read alignment. Current approaches do not account for the hierarchy of multiple read alignments to ge...

Estimating the continuum of quasars using the artificial neural networks

A lot of absorption lines are in the bluewards of Lyα emission line of quasar which is well-known as Lyα forest. Most of absorption lines in this forest belong to the Lyα absorption of the neutral hydrogen in the inter-galactic medium (IGM). For high redshift quasars and in the continuum with low and medium resolution, there are no many regions without absorption, so that, the quasar continuum i...

Journal:
  • Bioinformatics

Volume 31, Issue 8

Pages: -

Publication date: 2015